biological pathway
Biological Pathway Informed Models with Graph Attention Networks (GATs)
Wong, Gavin, Ho, Ping Shu, Yeung, Ivan Au, Cheung, Ka Chun, See, Simon
Biological pathways map gene-gene interactions that govern all human processes. Despite their importance, most ML models treat genes as unstructured tokens, discarding known pathway structure. The latest pathway-informed models capture pathway-pathway interactions, but still treat each pathway as a "bag of genes" via MLPs, discarding its topology and gene-gene interactions. We propose a Graph Attention Network (GAT) framework that models pathways at the gene level. We show that GATs generalize much better than MLPs, achieving an 81% reduction in MSE when predicting pathway dynamics under unseen treatment conditions. We further validate the correctness of our biological prior by encoding drug mechanisms via edge interventions, boosting model robustness. Finally, we show that our GAT model is able to correctly rediscover all five gene-gene interactions in the canonical TP53-MDM2-MDM4 feedback loop from raw time-series mRNA data, demonstrating potential to generate novel biological hypotheses directly from experimental data.
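To make the gene-level attention and the "edge intervention" idea from the abstract above concrete, here is a minimal single-head sketch in plain Python. It is not the authors' implementation: the scalar features, the parameters `a_src` and `a_dst`, the helper names `gat_update` and `intervene`, and the three-edge toy graph are all assumptions chosen for illustration, and the edge list is not the paper's five recovered interactions.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def gat_update(features, edges, a_src=1.0, a_dst=1.0):
    """One single-head attention pass over directed (src, dst) edges.

    `features` maps gene name -> scalar feature (a hypothetical simplification;
    real GAT layers use learned weight matrices and vector features).
    """
    updated = {}
    for node in features:
        sources = [s for (s, d) in edges if d == node]
        if not sources:
            updated[node] = features[node]  # no incoming edges: keep the feature
            continue
        # Raw attention score for each incoming edge.
        scores = [leaky_relu(a_src * features[s] + a_dst * features[node]) for s in sources]
        # Numerically stable softmax over the node's incoming edges only.
        top = max(scores)
        exps = [math.exp(v - top) for v in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        # Attention-weighted sum of the source features.
        updated[node] = sum(a * features[s] for a, s in zip(alphas, sources))
    return updated


def intervene(edges, blocked):
    """Model a drug mechanism by deleting the edges it is assumed to block."""
    return [e for e in edges if e not in blocked]


# Toy graph (illustrative values and edges only).
features = {"TP53": 1.0, "MDM2": 0.5, "MDM4": -0.5}
edges = [("MDM2", "TP53"), ("MDM4", "TP53"), ("TP53", "MDM2")]

baseline = gat_update(features, edges)
treated = gat_update(features, intervene(edges, {("MDM2", "TP53")}))
```

With the MDM2 -> TP53 edge removed, TP53 attends only to MDM4, so its updated value changes even though no node feature was edited; an MLP over a "bag of genes" has no equivalent place to encode such an intervention.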
Sparsity is All You Need: Rethinking Biological Pathway-Informed Approaches in Deep Learning
Caranzano, Isabella, Pancotti, Corrado, Rollo, Cesare, Sartori, Flavio, Liò, Pietro, Fariselli, Piero, Sanavia, Tiziana
Affiliations: (1) Computational Biomedicine Unit, Department of Medical Sciences, University of Torino, Torino, Italy (Caranzano, Pancotti, Rollo, Sartori, Fariselli, Sanavia); (2) Department of Computer Science and Technology, University of Cambridge, Cambridge, UK (Liò). Abstract: Biologically-informed neural networks typically leverage pathway annotations to enhance performance in biomedical applications. We hypothesized that the benefits of pathway integration do not arise from its biological relevance, but rather from the sparsity it introduces. We conducted a comprehensive analysis of all relevant pathway-based neural network models for predictive tasks, critically evaluating each study's contributions. From this review, we curated a subset of methods for which the source code was publicly available. The comparison of the biologically informed state-of-the-art deep learning models and their randomized counterparts showed that models based on randomized information performed as well as biologically informed ones across different metrics and datasets. Notably, in 3 out of the 15 analyzed models, the randomized versions even outperformed their biologically informed counterparts. Moreover, pathway-informed models did not show any clear advantage in interpretability, as randomized models were still able to identify relevant disease biomarkers despite lacking explicit pathway information. Our findings suggest that pathway annotations may be too noisy or inadequately explored by current methods. Therefore, we propose a methodology that can be applied to different domains and can serve as a robust benchmark for systematically comparing novel pathway-informed models against their randomized counterparts. 
This approach enables researchers to rigorously determine whether observed performance improvements can be attributed to biological insights. (arXiv:2505.04300v1) Background & Summary: When dealing with deep learning models, many functions that are efficiently computable through a machine learning approach exhibit what is called "compositional sparsity", meaning that they can be decomposed into a few simpler functions, each depending on only a [...]. Deep networks, such as Convolutional Neural Networks (CNNs) and Transformers, align with the compositional structure of many target functions, leading to better generalization since they approximate such functions efficiently without falling victim to the "curse of dimensionality", i.e. the exponential growth of computational complexity with input dimension [37, 12, 31, 13, 32]. This compositional sparsity can be further enhanced by introducing prior constraints on features, such as grouping features into concepts or modelling interactions among them.
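The comparison between a pathway-informed model and its randomized counterpart, as described in the abstract above, can be sketched as follows. This is a hedged illustration, not the paper's benchmark code: it assumes the pathway prior is a binary pathway-by-gene mask applied to a linear layer, and it randomizes by shuffling gene membership within each pathway so that pathway sizes and overall sparsity are preserved. The names `randomize_mask`, `sparsity`, and `masked_forward` are hypothetical, and the paper may use a different randomization scheme.

```python
import random


def randomize_mask(mask, seed=0):
    """Shuffle gene membership within each pathway row.

    Keeps every pathway's size (and hence the overall sparsity) unchanged
    while destroying the biological meaning of the connections.
    """
    rng = random.Random(seed)
    randomized = []
    for row in mask:
        shuffled = list(row)
        rng.shuffle(shuffled)
        randomized.append(shuffled)
    return randomized


def sparsity(mask):
    """Fraction of zero entries in the mask."""
    total = sum(len(row) for row in mask)
    zeros = sum(row.count(0) for row in mask)
    return zeros / total


def masked_forward(x, weights, mask):
    """Linear layer in which weight (p, g) is active only if mask[p][g] == 1."""
    return [
        sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
        for w_row, m_row in zip(weights, mask)
    ]


# Toy prior: 2 pathways over 6 genes (illustrative values only).
pathway_mask = [
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
]
random_mask = randomize_mask(pathway_mask, seed=42)

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
weights = [[1.0] * 6, [1.0] * 6]
informed_out = masked_forward(x, weights, pathway_mask)
random_out = masked_forward(x, weights, random_mask)
```

Training the same architecture once with `pathway_mask` and once with `random_mask` isolates the effect of sparsity from the effect of biological content, which is the question the abstract raises.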
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.54)
- Europe > Italy > Piedmont > Turin Province > Turin (0.24)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Overview (1.00)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Therapeutic Area > Neurology (0.68)
BioMaze: Benchmarking and Enhancing Large Language Models for Biological Pathway Reasoning
Zhao, Haiteng, Ma, Chang, Xu, Fangzhi, Kong, Lingpeng, Deng, Zhi-Hong
The applications of large language models (LLMs) in various biological domains have been explored recently, but their ability to reason about complex biological systems, such as pathways, remains underexplored, even though it is crucial for predicting biological phenomena, formulating hypotheses, and designing experiments. This work explores the potential of LLMs in pathway reasoning. We introduce BioMaze, a dataset with 5.1K complex pathway problems derived from real research, covering various biological contexts including natural dynamic changes, disturbances, additional intervention conditions, and multi-scale research targets. Our evaluation of methods such as CoT and graph-augmented reasoning shows that LLMs struggle with pathway reasoning, especially in perturbed systems. To address this, we propose PathSeeker, an LLM agent that enhances reasoning through interactive subgraph-based navigation, enabling a more effective approach to handling the complexities of biological systems in a scientifically aligned manner. The dataset and code are available at https://github.com/zhao-ht/BioMaze.
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
- Asia > China > Shaanxi Province > Xi'an (0.04)
Pathology-genomic fusion via biologically informed cross-modality graph learning for survival analysis
Zhang, Zeyu, Zhao, Yuanshen, Duan, Jingxian, Liu, Yaou, Zheng, Hairong, Liang, Dong, Zhang, Zhenyu, Li, Zhi-Cheng
The diagnosis and prognosis of cancer are typically based on multi-modal clinical data, including histology images and genomic data, due to the complex pathogenesis and high heterogeneity of the disease. Despite the advancements in digital pathology and high-throughput genome sequencing, establishing effective multi-modal fusion models for survival prediction and revealing the potential association between histopathology and transcriptomics remains challenging. In this paper, we propose the Pathology-Genome Heterogeneous Graph (PGHG), which integrates whole slide images (WSI) and bulk RNA-Seq expression data with a heterogeneous graph neural network for cancer survival analysis. The PGHG consists of a biological knowledge-guided representation learning network and a pathology-genome heterogeneous graph. The representation learning network utilizes the biological prior knowledge of intra-modal and inter-modal data associations to guide the feature extraction. The node features of each modality are updated through an attention-based graph learning strategy. Unimodal features and bi-modal fused features are extracted via an attention pooling module and then used for survival prediction. We evaluate the model on low-grade glioma, glioblastoma, and kidney renal papillary cell carcinoma datasets from the Cancer Genome Atlas (TCGA) and the First Affiliated Hospital of Zhengzhou University (FAHZU). Extensive experimental results demonstrate that the proposed method outperforms both unimodal and other multi-modal fusion models. To demonstrate the model's interpretability, we also visualize the attention heatmap of pathological images and use the integrated gradients algorithm to identify important tissue structures, biological pathways, and key genes.
- Asia > China > Henan Province > Zhengzhou (0.24)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Diagnostic Medicine (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Generating Drug Repurposing Hypotheses through the Combination of Disease-Specific Hypergraphs
Jain, Ayush, Laure-Charpignon, Marie, Chen, Irene Y., Philippakis, Anthony, Alaa, Ahmed
The drug development pipeline for a new compound can last 10-20 years and cost over $10 billion. Drug repurposing offers a more time- and cost-effective alternative. Computational approaches based on biomedical knowledge graph representations have recently yielded new drug repurposing hypotheses. In this study, we present a novel, disease-specific hypergraph representation learning technique to derive contextual embeddings of biological pathways of various lengths that all start at a given drug and end at the disease of interest. Further, we extend this method to multi-disease hypergraphs. To determine the repurposing potential of each of the 1,522 drugs, we derive drug-specific distributions of cosine similarity values and ultimately consider the median for ranking. Cosine similarity values are computed between (1) all biological pathways starting at the considered drug and ending at the disease of interest and (2) all biological pathways starting at drugs currently prescribed against that disease and ending at the disease of interest. We illustrate our approach with Alzheimer's disease (AD) and two of its risk factors: hypertension (HTN) and type 2 diabetes (T2D). We compare each drug's rank across four hypergraph settings (single- or multi-disease): AD only, AD + HTN, AD + T2D, and AD + HTN + T2D. Notably, our framework led to the identification of two promising drugs whose repurposing potential was significantly higher in hypergraphs combining two diseases: dapagliflozin (antidiabetic; moved up from the top 32% to the top 7% across all considered drugs) and debrisoquine (antihypertensive; moved up from the top 76% to the top 23%). Our approach serves as a hypothesis generation tool, to be paired with a validation pipeline relying on laboratory experiments and semi-automated parsing of the biomedical literature.
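The ranking step described in the abstract above (a distribution of cosine similarities per drug, summarized by its median) can be sketched as below. This is a minimal illustration under stated assumptions: pathway embeddings are taken as given non-zero vectors, and the names `cosine`, `repurposing_score`, `rank_drugs`, and the two-dimensional toy embeddings are hypothetical, not taken from the paper's code.

```python
import math
from statistics import median


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def repurposing_score(candidate_paths, reference_paths):
    """Median cosine similarity over all (candidate path, reference path) pairs."""
    return median(cosine(c, r) for c in candidate_paths for r in reference_paths)


def rank_drugs(paths_by_drug, reference_paths):
    """Return drug names sorted from highest to lowest median similarity."""
    scores = {
        drug: repurposing_score(paths, reference_paths)
        for drug, paths in paths_by_drug.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy embeddings (illustrative values only).
reference_paths = [[1.0, 0.0]]  # paths from drugs already prescribed for the disease
paths_by_drug = {
    "drug_a": [[1.0, 0.0], [1.0, 1.0]],
    "drug_b": [[0.0, 1.0], [-1.0, 0.0]],
}

ranking = rank_drugs(paths_by_drug, reference_paths)
```

Using the median rather than the mean makes a drug's score less sensitive to a handful of unusually similar or dissimilar pathways, which matters when drugs differ widely in how many pathways connect them to the disease.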
- North America > United States > California > San Francisco County > San Francisco (0.04)
- North America > United States > Virginia (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
Reusability report: Prostate cancer stratification with diverse biologically-informed neural architectures
Pedersen, Christian, Tesileanu, Tiberiu, Wu, Tinghui, Golkar, Siavash, Cranmer, Miles, Zhang, Zijun, Ho, Shirley
In Elmarakeby et al., "Biologically informed deep neural network for prostate cancer discovery", a feedforward neural network with biologically informed, sparse connections (P-NET) was presented to model the state of prostate cancer. We verified the reproducibility of the study conducted by Elmarakeby et al., using both their original codebase and our own re-implementation using more up-to-date libraries. We quantified the contribution of network sparsification by Reactome biological pathways, and confirmed its importance to P-NET's superior performance. Furthermore, we explored alternative neural architectures and approaches to incorporating biological information into the networks. We experimented with three types of graph neural networks on the same training data, and investigated the clinical prediction agreement between different models. Our analyses demonstrated that deep neural networks with distinct architectures make incorrect predictions for individual patients that are persistent across different initializations of a specific neural architecture. This suggests that different neural architectures are sensitive to different aspects of the data, an important yet under-explored challenge for clinical prediction tasks.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.15)
- North America > United States > New York (0.06)
- North America > United States > California > Los Angeles County > Los Angeles (0.04)
How is AI Being Used to Advance Healthcare in 2019 & 2020
In recent years, artificial intelligence (AI) has become a hot topic, largely due to its potential to transform the ability of computers to solve increasingly complex problems in technology and society. Machines that are able to learn and "think" like a human brain offer great potential in advancing science and innovation by evaluating complicated scenarios in a fraction of the time it would take a person. While AI is still in its early stages, the biotech industry is already leveraging AI tools to accelerate drug discovery and advance health research. Worldwide, health research is occurring at a larger volume than at any time in human history. Thousands of peer-reviewed articles are generated every month; a single person cannot keep up with the constant surge of new information, let alone quickly process and integrate it into their existing knowledge base. Many biotech researchers are now using AI to manage the onslaught of data and make sure no meaningful pieces slip through the cracks.
Encoding Higher Level Extensions of Petri Nets in Answer Set Programming
Anwar, Saadat, Baral, Chitta, Inoue, Katsumi
Answering realistic questions about biological systems and pathways, similar to the ones used by textbooks to test students' understanding of biological systems, is one of our long-term research goals. Often these questions require simulation-based reasoning. To answer such questions, we need formalisms to build pathway models, add extensions, simulate, and reason with them. We chose Petri Nets and Answer Set Programming (ASP) as suitable formalisms, since Petri Net models are similar to biological pathway diagrams, and ASP provides easy extension and strong reasoning abilities. We found that certain aspects of biological pathways, such as locations and substance types, cannot be represented succinctly using regular Petri Nets. As a result, we need higher-level constructs like colored tokens. In this paper, we show how Petri Nets with colored tokens can be encoded in ASP in an intuitive manner, how additional Petri Net extensions can be added by making small code changes, and how this work furthers our long-term research goals. Our approach can be adapted to other domains with similar modeling needs.
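For readers unfamiliar with colored tokens, the firing rule that the abstract above refers to can be sketched in a few lines. The paper encodes this in ASP; the sketch below uses Python instead, purely to illustrate the semantics, and every name in it (`enabled`, `fire`, the `consume`/`produce` dictionary layout, and the hexokinase-style toy transition) is an assumption made for this example rather than the authors' encoding.

```python
from collections import Counter


def enabled(marking, transition):
    """A transition is enabled if every input place holds enough tokens of each color."""
    return all(
        marking.get(place, Counter())[color] >= count
        for place, needed in transition["consume"].items()
        for color, count in needed.items()
    )


def fire(marking, transition):
    """Return the new marking after firing; the input marking is left untouched."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled in this marking")
    new_marking = {place: Counter(tokens) for place, tokens in marking.items()}
    for place, needed in transition["consume"].items():
        new_marking.setdefault(place, Counter()).subtract(needed)
    for place, produced in transition["produce"].items():
        new_marking.setdefault(place, Counter()).update(produced)
    return new_marking


# Places model locations; token colors model substance types (toy example).
marking = {"cytosol": Counter({"glucose": 2, "atp": 1})}
phosphorylation = {
    "consume": {"cytosol": {"glucose": 1, "atp": 1}},
    "produce": {"cytosol": {"g6p": 1, "adp": 1}},
}

after = fire(marking, phosphorylation)
can_fire_again = enabled(after, phosphorylation)
```

A regular Petri Net would need a separate place for every (location, substance) pair; with colored tokens a single place per location suffices, which is the succinctness argument made in the abstract.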
- North America > United States > Arizona > Maricopa County > Tempe (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)